Grokking Database Fundamentals for Tech Interviews
Introduction to NoSQL

The growth of big data and the demand for high-performance applications have driven the adoption of NoSQL databases. Unlike traditional relational databases, NoSQL systems are designed to handle unstructured and semi-structured data, trading some relational guarantees for flexibility and scalability. They typically use dynamic schemas and distributed architectures, which lets modern web applications scale out across many servers.

NoSQL databases cater to diverse requirements, from real-time analytics to complex relationship modeling. Their ability to support changing data structures and handle massive data volumes makes them indispensable in today’s data-driven world.

Motivation Behind NoSQL

Challenges with Relational Databases

  1. Rigid Schema Design: Relational databases require predefined schemas, making it difficult to handle dynamic or evolving data structures.
  2. Scalability Constraints: Relational databases have traditionally scaled vertically, which demands more powerful hardware that is costly and ultimately limited in capacity.
  3. Performance Bottlenecks: Complex joins and transactions can degrade performance as data grows.

Why NoSQL?

  1. Flexible Data Models: NoSQL databases accommodate unstructured or semi-structured data, such as JSON or XML, making them ideal for modern applications.
  2. Horizontal Scalability: They scale out by adding more servers, efficiently managing large volumes of data and high traffic.
  3. High Performance: Optimized for specific use cases, NoSQL databases can deliver faster reads and writes, often by storing related data together so that complex joins are unnecessary.
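
Horizontal scaling usually depends on deciding which server owns which key. The sketch below is a minimal illustration, not how any specific database does it: `shard_for` and `ShardedStore` are hypothetical names, and each "server" is just an in-memory dictionary.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (assumption: simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Hypothetical store that spreads keys over several in-memory 'servers'."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    for i in range(1000):
        store.put(f"user:{i}", {"id": i})
    # Show how many keys each shard ended up holding.
    print([len(shard) for shard in store.shards])
```

Modulo placement is the simplest option, but changing the number of shards remaps most keys. Production systems often use consistent hashing or range-based partitioning to limit that data movement.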

Example Use Case:

  • An e-commerce platform using MongoDB can store diverse product data in JSON-like documents, avoiding rigid table structures and enabling rapid changes to the data model.
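
To make the flexible-schema idea concrete, here is a minimal sketch that uses plain Python dictionaries in place of a real document database. The product fields and the helper names `add_field` and `find` are assumptions made up for illustration; they are not MongoDB APIs.

```python
import json

# Hypothetical product documents: each one carries only the fields it needs.
products = [
    {"_id": 1, "name": "Laptop", "price": 999.0, "specs": {"ram_gb": 16, "cpu": "8-core"}},
    {"_id": 2, "name": "T-Shirt", "price": 19.5, "sizes": ["S", "M", "L"], "color": "blue"},
]


def add_field(doc, key, value):
    """Return a copy of the document with one extra field; other documents are untouched."""
    updated = dict(doc)
    updated[key] = value
    return updated


def find(collection, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [d for d in collection if all(d.get(k) == v for k, v in criteria.items())]


# Evolve one document without altering the shape of the others.
products[1] = add_field(products[1], "material", "cotton")

if __name__ == "__main__":
    for doc in products:
        print(json.dumps(doc))
```

In a relational schema, adding `material` would typically mean altering a table that every product row shares. Here only the document that needs the field changes.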

CAP Theorem and NoSQL

The CAP theorem explains the trade-offs in distributed systems between Consistency, Availability, and Partition Tolerance. According to the theorem:

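  • Consistency (C): Every read returns the most recent write, so all nodes appear to hold the same data at the same time.
  • Availability (A): Every request to a working node receives a non-error response, although the response is not guaranteed to contain the latest write.
  • Partition Tolerance (P): The system continues to operate even when network failures prevent some nodes from communicating with others.

In a distributed environment, network partitions cannot be ruled out, so partition tolerance is effectively mandatory. The practical choice is therefore between consistency and availability while a partition lasts, which leads to two common designs: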

  • CP Systems: Prioritize Consistency and Partition Tolerance but may sacrifice availability during network issues.
    • Example: HBase
  • AP Systems: Prioritize Availability and Partition Tolerance but allow temporary inconsistencies.
    • Example: Cassandra
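
The difference between the two designs can be sketched with a toy two-replica store. This is a simplified model under stated assumptions: `TwoNodeStore`, `Replica`, and `Unavailable` are hypothetical names, replication is synchronous when the network is healthy, and a partition is simulated with a boolean flag. It does not describe how HBase or Cassandra are implemented.

```python
class Unavailable(Exception):
    """Raised when a CP store refuses a request during a partition."""


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy store with replicas 'a' and 'b' and a simulated network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": Replica("a"), "b": Replica("b")}
        self.partitioned = False

    def write(self, node, key, value):
        local = self.replicas[node]
        peer = self.replicas["b" if node == "a" else "a"]
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent: reject the write because the peer is unreachable.
                raise Unavailable("peer unreachable; write rejected")
            # Stay available: accept locally and let the replicas diverge for now.
            local.data[key] = value
            return
        local.data[key] = value
        peer.data[key] = value

    def read(self, node, key):
        return self.replicas[node].data.get(key)


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        store = TwoNodeStore(mode)
        store.write("a", "cart", ["book"])
        store.partitioned = True
        try:
            store.write("a", "cart", ["book", "pen"])
            outcome = "accepted"
        except Unavailable:
            outcome = "rejected"
        print(mode, outcome, store.read("a", "cart"), store.read("b", "cart"))
```

In CP mode both replicas keep returning the same value because the write is refused. In AP mode the write succeeds, but the two replicas disagree until the partition heals and they are reconciled.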

BASE Properties

To address the limitations highlighted by the CAP theorem, NoSQL databases often follow the BASE model. Here, BASE is an acronym for Basically Available, Soft state, Eventual consistency.

Basically Available

The system ensures availability for most requests. Even during partial failures, it provides a response, although the data might not be the latest.

Example:

  • A Redis deployment with replicas (for example, Redis Cluster or Redis Sentinel) can keep serving requests when a primary node goes down by promoting a replica to take its place.

Soft State

The state of the system may change over time even without new client input, because replicas continue to exchange updates in the background. At any given moment, replicas are not guaranteed to hold identical data.

Example:

  • In a MongoDB replica set, writes are applied on the primary and replicated to secondaries asynchronously, so a read from a secondary may briefly return an older version of a document.

Eventual Consistency

All nodes eventually converge to the same state, given enough time and no new updates. This ensures consistency over the long term.

Example:

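  • In Cassandra, a write is sent to all replicas responsible for the key, but the coordinator waits only for the number of acknowledgments required by the chosen consistency level. Replicas that miss the write may serve outdated data temporarily, but they eventually catch up and reflect the same state.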
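
Convergence can be illustrated with a small simulation. The sketch assumes last-write-wins conflict resolution based on a caller-supplied timestamp, which is one simple rule among several. `Replica` and `anti_entropy` are hypothetical names, and the code is not a description of Cassandra's internals.

```python
class Replica:
    """Toy replica that keeps the newest (timestamp, value) pair per key."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        current = self.store.get(key)
        # Last-write-wins: only overwrite when the incoming version is newer.
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


def anti_entropy(replicas):
    """Exchange every known version so all replicas end up with the newest one."""
    for source in replicas:
        for key, (timestamp, value) in list(source.store.items()):
            for target in replicas:
                target.apply(key, timestamp, value)


if __name__ == "__main__":
    r1, r2, r3 = Replica(), Replica(), Replica()
    r1.apply("profile", 1, "old bio")  # the first write reached only r1
    r2.apply("profile", 2, "new bio")  # a later write reached only r2
    print([r.read("profile") for r in (r1, r2, r3)])  # replicas disagree
    anti_entropy([r1, r2, r3])
    print([r.read("profile") for r in (r1, r2, r3)])  # replicas agree
```

Before the synchronization pass, each replica answers with whatever it happens to hold. After it, all three return the value with the highest timestamp, provided no new writes arrive in the meantime.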

How BASE Solves CAP Problems

BASE properties align with the AP model in the CAP theorem, prioritizing Availability and Partition Tolerance while allowing Eventual Consistency. This trade-off suits applications requiring high availability, such as social media platforms and real-time analytics.

ACID vs. BASE

The BASE model contrasts sharply with the ACID properties of relational databases. Here’s a comparison:

Property | ACID (Relational) | BASE (NoSQL)
Atomicity | Transactions are all-or-nothing. | No strict guarantees; partial failures possible.
Consistency | Ensures strong consistency after transactions. | Eventual consistency; temporary discrepancies.
Isolation | Concurrent transactions do not interfere. | May allow some interference.
Durability | Once committed, data is permanent. | Durability depends on system configuration.
Availability | Limited by consistency during failures. | High availability prioritized over consistency.
Scalability | Primarily vertical scaling. | Horizontal scaling with distributed architectures.
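
The all-or-nothing behavior in the ACID column can be demonstrated with Python's built-in `sqlite3` module. This is a minimal sketch: the `accounts` table, the `transfer` and `balances` helpers, and the sample balances are all assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit together, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # This debit violates the CHECK constraint when funds are insufficient.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    print(transfer(conn, "alice", "bob", 500), balances(conn))
    print(transfer(conn, "alice", "bob", 30), balances(conn))
```

When the debit fails, the credit that already ran inside the same transaction is rolled back, so no money appears from nowhere. A store without multi-key transactions would need application-level logic to detect and repair such a half-applied update.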

Categories of NoSQL Databases

NoSQL databases are categorized into four main types, each optimized for specific use cases:

  1. Key-Value Stores

    • Store data as key-value pairs.
    • Suitable for caching and session management.
    • Example: Redis
  2. Document-Oriented Databases

    • Store data in documents (JSON/BSON) with dynamic schemas.
    • Ideal for content management systems.
    • Example: MongoDB
  3. Column-Family Stores

    • Organize data into column families for fast queries.
    • Suitable for analytics and time-series data.
    • Example: Cassandra
  4. Graph Databases

    • Represent data as nodes and edges to model relationships.
    • Used for social networks and recommendation systems.
    • Example: Neo4j
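
The four categories differ mainly in the shape of the data they store. The sketch below mimics each shape with plain Python structures. All keys, fields, and helper names are hypothetical, and real systems add persistence, indexing, and distribution on top.

```python
# Key-value store: opaque values looked up by key (session cache example).
key_value = {"session:42": "user=alice;cart=3"}

# Document store: nested, self-describing records whose fields may differ.
documents = [
    {"_id": 1, "title": "Intro to NoSQL", "tags": ["db", "nosql"]},
    {"_id": 2, "title": "CAP explained", "author": {"name": "Sam"}},
]

# Column-family store: row key -> column family -> column -> value.
column_family = {
    "sensor-7": {
        "readings": {"2024-01-01T00:00": 21.5, "2024-01-01T00:05": 21.7},
        "meta": {"location": "lab"},
    }
}

# Graph database: nodes plus labeled edges between them.
nodes = {"alice", "bob", "carol"}
edges = [("alice", "FOLLOWS", "bob"), ("bob", "FOLLOWS", "carol")]


def followed_by(user):
    """Users that the given user follows directly."""
    return [dst for src, label, dst in edges if src == user and label == "FOLLOWS"]


def friends_of_friends(user):
    """Two-hop traversal, the kind of query graph databases are built for."""
    return sorted({fof for f in followed_by(user) for fof in followed_by(f)} - {user})


if __name__ == "__main__":
    print(key_value["session:42"])
    print(documents[1]["author"]["name"])
    print(column_family["sensor-7"]["readings"])
    print(friends_of_friends("alice"))
```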

Key Features of NoSQL

  1. Dynamic Schema: Accommodates changing data structures, often without a formal schema migration.
  2. Horizontal Scalability: Handles growing data by adding servers.
  3. High Performance: Optimized for specific workloads like real-time analytics.
  4. Fault Tolerance: Replication across a distributed architecture keeps data available when individual nodes fail.

NoSQL databases address several limitations of relational systems, particularly rigid schemas and vertical scaling. By following BASE properties, they offer scalable, flexible, and highly available storage for modern applications, at the cost of weaker consistency guarantees. Understanding these trade-offs helps you choose the right NoSQL database for your needs.
